107 research outputs found

Automatic Search Intervals for the Smoothing Parameter in Penalized Splines

Authors: Cao Jiguo; Li Zheyuan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 08/12/2022

The selection of the smoothing parameter is central to the estimation of penalized splines. The best value of the smoothing parameter is often the one that optimizes a smoothness selection criterion, such as generalized cross-validation error (GCV) or restricted likelihood (REML). To correctly identify the global optimum rather than being trapped in an undesired local optimum, grid search is recommended for optimization. Unfortunately, the grid search method requires a pre-specified search interval that contains the unknown global optimum, yet no guideline is available for providing this interval. As a result, practitioners have to find it by trial and error. To overcome this difficulty, we develop novel algorithms to automatically find this interval. Our automatic search interval has four advantages. (i) It specifies a smoothing parameter range where the associated penalized least squares problem is numerically solvable. (ii) It is criterion-independent, so that different criteria, such as GCV and REML, can be explored on the same parameter range. (iii) It is sufficiently wide to contain the global optimum of any criterion, so that, for example, the global minimum of GCV and the global maximum of REML can both be identified. (iv) It is computationally cheap compared with the grid search itself, carrying no extra computational burden in practice. Our method is ready to use through our recently developed R package gps (version >= 1.1). It may be embedded in more advanced statistical modeling methods that rely on penalized splines. Comment: R code is available at https://github.com/ZheyuanLi/gps-vignettes/blob/main/gps2.pd

arXiv.org e-Print Archive
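
The grid search described in the abstract above can be illustrated with a minimal sketch. It is not the authors' algorithm and does not use the gps package: it assumes a simple Whittaker-type smoother (identity basis with a second-order difference penalty), and the function name `gcv_grid_search` and the search interval [-5, 15] on the log scale are hypothetical choices made by hand, which is exactly the trial-and-error step the paper aims to automate.

```python
import numpy as np

def gcv_grid_search(y, log_lambda_grid, diff_order=2):
    """Evaluate GCV on a grid of log(lambda) for a difference-penalty smoother.

    Assumption: identity basis, so the fit is (I + lambda * D'D)^{-1} y.
    """
    n = len(y)
    # Difference matrix of shape (n - diff_order, n)
    D = np.diff(np.eye(n), n=diff_order, axis=0)
    P = D.T @ D
    scores = []
    for log_lam in log_lambda_grid:
        lam = np.exp(log_lam)
        # Hat matrix of the penalized least squares problem
        H = np.linalg.solve(np.eye(n) + lam * P, np.eye(n))
        resid = y - H @ y
        edf = np.trace(H)  # effective degrees of freedom
        scores.append(n * (resid @ resid) / (n - edf) ** 2)
    scores = np.array(scores)
    best = int(np.argmin(scores))
    return log_lambda_grid[best], scores

# Synthetic data, used only for illustration
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 100)
y = np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(100)

# Hand-picked search interval (an assumption, not the automatic interval)
grid = np.linspace(-5.0, 15.0, 41)
best_log_lam, scores = gcv_grid_search(y, grid)
```

If the minimizer lands on either end of the grid, the hand-picked interval was probably too narrow, which is the failure mode that motivates an automatic search interval.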

Faster model matrix crossproducts for large generalized linear models with discretized covariates

Authors: Li Zheyuan; Wood Simon N
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/03/2019

Explore Bristol Research

Generalized Additive Models for Large datasets: spatial-temporal modelling of the UK's Daily Black Smoke (1961 - 2005)

Author: Li Zheyuan
Publication date: 13/02/2019

OPUS

Exploring the Cognitive Knowledge Structure of Large Language Models: An Educational Diagnostic Assessment Approach

Authors: Hou Lei; Li Juanzi; Yu Jifan; Zhang Zheyuan
Publication date: 18/10/2023

Large Language Models (LLMs) have not only exhibited exceptional performance across various tasks, but also demonstrated sparks of intelligence. Recent studies have focused on assessing their capabilities on human exams and revealed their impressive competence in different domains. However, cognitive research on the overall knowledge structure of LLMs is still lacking. In this paper, based on an educational diagnostic assessment method, we conduct an evaluation using MoocRadar, a meticulously annotated human test dataset based on Bloom's Taxonomy. We aim to reveal the knowledge structures of LLMs and gain insights into their cognitive capabilities. This research emphasizes the significance of investigating LLMs' knowledge and understanding the disparate cognitive patterns of LLMs. By shedding light on models' knowledge, researchers can advance the development and utilization of LLMs in a more informed and effective manner. Comment: Findings of EMNLP 2023 (Short Paper)

arXiv.org e-Print Archive

A physical neural network training approach toward multi-plane light conversion design

Authors: Doerr Joe H.; Li Guifang; Pang Shuo; Zhu Zheyuan
Publication date: 05/04/2023

Multi-plane light converter (MPLC) designs supporting hundreds of modes are attractive in high-throughput optical communications. These photonic structures typically comprise >10 phase masks in free space, with millions of independent design parameters. Conventional MPLC design using wavefront matching updates one mask at a time while fixing the rest. Here we construct a physical neural network (PNN) to model the light propagation and phase modulation in MPLC, providing access to the entire parameter set for optimization, including not only the profiles of the phase masks but also the distances between them. PNN training supports flexible optimization sequences and is a superset of existing MPLC design methods. In addition, our method allows tuning of hyperparameters of PNN training such as learning rate and batch size. Because PNN-based MPLC is found to be insensitive to the number of input and target modes in each training step, we have demonstrated a high-order MPLC design (45 modes) using mini-batches that fit into the available computing resources. Comment: Draft for submission to Optics Express

arXiv.org e-Print Archive

SUIT: Learning Significance-guided Information for 3D Temporal Detection

Authors: Lu Jiachen; Xu Hang; Zeng Yihan; Zhang Li; Zhou Zheyuan
Publication date: 04/07/2023

3D object detection from LiDAR point clouds is of critical importance for autonomous driving and robotics. While sequential point clouds have the potential to enhance 3D perception through temporal information, utilizing these temporal features effectively and efficiently remains a challenging problem. Based on the observation that the foreground information is sparsely distributed in LiDAR scenes, we believe sufficient knowledge can be provided by a sparse format rather than dense maps. To this end, we propose to learn Significance-gUided Information for 3D Temporal detection (SUIT), which simplifies temporal information as sparse features for information fusion across frames. Specifically, we first introduce a significant sampling mechanism that extracts information-rich yet sparse features based on predicted object centroids. On top of that, we present an explicit geometric transformation learning technique, which learns the object-centric transformations among sparse features across frames. We evaluate our method on the large-scale nuScenes and Waymo datasets, where SUIT not only significantly reduces the memory and computation cost of temporal fusion, but also performs well compared with the state-of-the-art baselines. Comment: Accepted to IROS 202

arXiv.org e-Print Archive